Preparing Data Sets for the Data Mining Analysis using the Most Efficient Horizontal Aggregation Method in SQL
Abstract
Preparing a data set for data mining analysis takes a large amount of time, because practitioners must write complex SQL queries and join many tables to obtain the aggregated result. Traditional SQL aggregations prepare data sets in a vertical layout; that is, they return one column per aggregated group. A data mining project, however, requires the data set in a horizontal layout, and the analysis becomes more efficient when the data set is obtained in that form. Three existing horizontal aggregation methods are used to transform the data into a suitable form: the SPJ (select, project, join) method, the CASE method, and the PIVOT method. The main aim is to identify the most efficient of these three methods in terms of time and space complexity. The methods are compared using large tables, and the CASE method is found to be more efficient than the SPJ and PIVOT methods.

General Terms: SQL Queries, Database Management, Data Mining
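The difference between the vertical layout and the horizontal layout, and two of the three methods named in the abstract, can be illustrated with a small sketch. It is not taken from the paper: the `sales` table, its columns, and the sample rows are hypothetical, and SQLite (through Python's `sqlite3` module) is used only so that the example runs anywhere. The PIVOT method is not shown because SQLite does not provide a PIVOT operator.

```python
import sqlite3

# Hypothetical sample data: one row per sale (store, quarter, amount).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL);
INSERT INTO sales VALUES
  ('A', 'Q1', 10), ('A', 'Q1', 5), ('A', 'Q2', 20),
  ('B', 'Q1', 7),  ('B', 'Q2', 3), ('B', 'Q2', 4);
""")

# Vertical layout: a standard GROUP BY returns one row per (store, quarter).
vertical = conn.execute("""
    SELECT store, quarter, SUM(amount)
    FROM sales
    GROUP BY store, quarter
    ORDER BY store, quarter
""").fetchall()

# CASE method: one scan of the table, one output column per quarter.
case_rows = conn.execute("""
    SELECT store,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE NULL END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE NULL END) AS q2
    FROM sales
    GROUP BY store
    ORDER BY store
""").fetchall()

# SPJ method: one aggregated subquery per quarter, joined back on the key.
spj_rows = conn.execute("""
    SELECT s.store, q1.total, q2.total
    FROM (SELECT DISTINCT store FROM sales) AS s
    LEFT JOIN (SELECT store, SUM(amount) AS total
               FROM sales WHERE quarter = 'Q1' GROUP BY store) AS q1
           ON q1.store = s.store
    LEFT JOIN (SELECT store, SUM(amount) AS total
               FROM sales WHERE quarter = 'Q2' GROUP BY store) AS q2
           ON q2.store = s.store
    ORDER BY s.store
""").fetchall()

print(vertical)   # four rows, one per (store, quarter) group
print(case_rows)  # two rows, one per store, with a column per quarter
print(spj_rows)   # same horizontal result as the CASE method
```

In this sketch the CASE query reads the table once, while the SPJ query builds and joins one subquery per output column, which gives some intuition for why the cost of the two approaches can differ on large tables. It says nothing by itself about the measurements reported in the paper.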
Similar resources
Improving Analysis Of Data Mining By Creating Dataset Using Sql Aggregations
In data mining, an important goal is to generate efficient data. Efficiency and scalability have always been important concerns in the field of data mining. The increased complexity of the task calls for algorithms that are inherently more expensive. To analyze data efficiently, data mining systems widely use datasets with columns in a horizontal tabular layout. Preparing a data set is mor...
A Better Approach for Horizontal Aggregations in SQL Using Data Sets for Data Mining Analysis
To analyze data efficiently, data mining systems widely use datasets with columns in a horizontal tabular layout. Generally, preparing a data set is the more complex task in a data mining project, requiring many complex SQL queries, aggregating columns, and joining tables. Conventional RDBMSs usually manage tables in vertical form. Aggregated columns in a horizontal tabular layout retu...
K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizontal Aggregations
Data mining refers to the process of analyzing data from different perspectives and summarizing it into useful information that is mostly used by different users for analyzing the data as well as for preparing data sets. A data set is a collection of data that is present in tabular form. Preparing a data set involves complex SQL queries, joining tables, and aggregate functions. Tradition...
Multi Dimensionalised Aggregation on Horizontal Datasets Using SAAS In Data Mining Domain
Projecting data in different dimensions is the core concept taken for this project. Preparing a data set for analysis is generally the most time consuming task in a data mining project. In the existing system they used simple, yet powerful, methods to generate SQL (Structured Query Language) code to return aggregated columns in a horizontal tabular layout, returning a set of numbers instead of ...
Prepare and Optimize Data Sets for Data Mining Analysis
Getting ready a data set for examination is usually the tedious errand in a data mining task, needing numerous complex SQL queries, joining tables and conglomerating sections. Existing SQL aggregations have limitations to get ready data sets since they give back one section for every amassed bunch. As a rule, a significant manual exertion is obliged to construct data sets, where a horizontal la...